Mike's slide deck:


*** Lecture 1:

Makes the point that you can't represent many posteriors exactly, motivating approximation

Laplace approximation (saddle-point approximation)
inc. multivariate
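
For reference, the standard multivariate form (my notation, not necessarily the slides'): expand log p(theta|D) to second order around a mode theta*, which gives the Gaussian

    q(\theta) = \mathcal{N}\big(\theta \mid \theta^{*},\, A^{-1}\big),
    \qquad
    A = -\nabla\nabla \log p(\theta \mid D)\big|_{\theta=\theta^{*}}

So the approximation is always a Gaussian centred on a mode, which ties back to the opening point about distributions you can't represent.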

Defines variational: just minimising the distance between q and the posterior
Defines functional (a function of a function), gives entropy as example
Calculus of variations: derivatives of a functional as the input function is changed
Hence the name
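
The entropy example written out (standard definition, presumably what the slide shows):

    H[p] = -\int p(x) \log p(x)\, dx

i.e. a functional: it takes the whole function p and returns a single number.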

Presents maths to get the usual form log p(D) = L(q(theta)) + KL(q(theta) || p(theta|D))
Gives names for the terms: fixed (log evidence) / variational lower bound / KL divergence
Observation: since log p(D) is fixed, increasing the lower bound reduces the KL
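
The decomposition written out (standard Bishop-style form, my notation):

    \log p(D)
    = \underbrace{\int q(\theta) \log \frac{p(D, \theta)}{q(\theta)}\, d\theta}_{\mathcal{L}(q)}
    \;-\; \int q(\theta) \log \frac{p(\theta \mid D)}{q(\theta)}\, d\theta

where the second term is KL(q || p(theta|D)) >= 0. The left side doesn't depend on q, so pushing L(q) up necessarily pushes the KL down.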

Behaviour of KL as a measure of difference
Bishop's figures
Then multimodal, how there are two answers (a unimodal q can lock onto either mode)
Presumably explains why the KL is the correct way around (q||p rather than p||q)
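
A minimal numerical sketch of the mode-seeking behaviour (my own toy example, not from the slides; uses scipy): discretise a bimodal p, then compare reverse KL(q||p) for a unimodal q sitting on one mode versus one parked between the modes.

    import numpy as np
    from scipy.stats import norm

    # Bimodal target p: equal mixture of N(-3, 1) and N(+3, 1), on a grid.
    xs = np.linspace(-10.0, 10.0, 2001)
    dx = xs[1] - xs[0]
    p = 0.5 * norm.pdf(xs, -3, 1) + 0.5 * norm.pdf(xs, 3, 1)

    def reverse_kl(q, p):
        # KL(q || p) approximated by a Riemann sum on the grid.
        return float(np.sum(q * np.log(q / p)) * dx)

    q_mode    = norm.pdf(xs, 3, 1)   # unimodal q sitting on one mode
    q_between = norm.pdf(xs, 0, 1)   # same width, parked between the modes

    print(reverse_kl(q_mode, p))     # ~log 2: covering one mode is cheap
    print(reverse_kl(q_between, p))  # much larger: q puts mass where p is tiny

Reverse KL punishes q for putting mass where p has little, so the fitted q grabs one mode; forward KL would instead smear over both.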

Next lecture is mean field (doesn't actually say mean field)


*** Lecture 2:

Recap of last lecture (5 slides)

Gives factorised q = mean field
Idea that this approach is automatic: choose a factorisation, get the update equations
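
The factorised family, for reference:

    q(\theta) = \prod_{i} q_i(\theta_i)

Nothing else is assumed about the individual factors; their functional forms fall out of the optimisation.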

Derives the general mean-field update equation
Gives convergence properties (the lower bound always increases under the updates)
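
The general update written out (standard result, my notation):

    \log q_j^{*}(\theta_j) = \mathbb{E}_{i \neq j}\big[\log p(D, \theta)\big] + \text{const}

with the expectation taken under all factors except q_j. Cycling through the factors is coordinate ascent on L(q), hence the monotone lower bound.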

Univariate Gaussian example (full derivation)
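
A runnable sketch of that example (my own choice of conjugate-style priors, variable names, and data; the slides' exact setup may differ): data x_n ~ N(mu, 1/tau), priors mu|tau ~ N(mu0, 1/(lam0*tau)) and tau ~ Gamma(a0, b0), with q(mu, tau) = q(mu) q(tau).

    import numpy as np

    # Mean-field VI (coordinate ascent) for x_n ~ N(mu, 1/tau).
    rng = np.random.default_rng(0)
    x = rng.normal(loc=2.0, scale=1.5, size=200)   # synthetic data
    N, xbar = len(x), x.mean()

    mu0, lam0, a0, b0 = 0.0, 1.0, 1e-3, 1e-3       # broad priors (illustrative)

    E_tau = 1.0                                     # initial guess for E[tau]
    for _ in range(50):
        # Update q(mu) = N(mu_N, 1/lam_N): only lam_N depends on E[tau].
        mu_N  = (lam0 * mu0 + N * xbar) / (lam0 + N)
        lam_N = (lam0 + N) * E_tau
        E_mu, V_mu = mu_N, 1.0 / lam_N

        # Update q(tau) = Gamma(a_N, b_N) (shape/rate), taking expectations
        # of the squared terms under q(mu).
        a_N = a0 + 0.5 * (N + 1)
        b_N = b0 + 0.5 * (((x - E_mu) ** 2).sum() + N * V_mu
                          + lam0 * ((E_mu - mu0) ** 2 + V_mu))
        E_tau = a_N / b_N

    print(f"E[mu]  = {mu_N:.3f}   vs sample mean  {xbar:.3f}")
    print(f"E[tau] = {E_tau:.3f}  vs 1/sample var {1.0 / x.var():.3f}")

The circular dependence (q(mu) needs E[tau]; q(tau) needs E[mu] and var[mu]) is exactly why the updates are iterated.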

Linear regression (full derivation)
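
For reference, the usual mean-field updates for this model (Bishop-style setup with known noise precision beta, prior w ~ N(0, alpha^{-1} I), alpha ~ Gamma(a0, b0), and q(w, alpha) = q(w) q(alpha); I'm assuming the slides use something like this):

    q(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N, \mathbf{S}_N),
    \quad
    \mathbf{S}_N = \big(\mathbb{E}[\alpha]\,\mathbf{I} + \beta\,\Phi^{\top}\Phi\big)^{-1},
    \quad
    \mathbf{m}_N = \beta\,\mathbf{S}_N \Phi^{\top}\mathbf{t}

    q(\alpha) = \mathrm{Gamma}(\alpha \mid a_N, b_N),
    \quad
    a_N = a_0 + \tfrac{M}{2},
    \quad
    b_N = b_0 + \tfrac{1}{2}\big(\mathbf{m}_N^{\top}\mathbf{m}_N + \mathrm{tr}\,\mathbf{S}_N\big)

where M is the number of basis functions. Each update needs the other's expectations, so again iterate to convergence.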

Ends with notes, some examples, and using the lower bound for model selection
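
The model-selection use of the bound, as an equation: for each candidate model m,

    \log p(D \mid m) = \mathcal{L}_m(q) + \mathrm{KL}\big(q \,\|\, p(\theta \mid D, m)\big) \;\geq\; \mathcal{L}_m(q)

so models can be ranked by their converged bounds, with the unknown KL gap as the caveat.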

